ExaML version 3: a tool for phylogenomic analyses on supercomputers
Abstract
MOTIVATION Phylogenies are increasingly used in all fields of medical and biological research. Because of the next-generation sequencing revolution, the datasets used for phylogenetic analyses are growing at an unprecedented pace. We present ExaML version 3, a dedicated production-level code for inferring phylogenies on whole-transcriptome and whole-genome alignments using supercomputers.
RESULTS We introduce several improvements and extensions to ExaML: extensions of the substitution models and supported data types, the integration of a novel load balancing algorithm and a parallel I/O optimization that together significantly improve parallel efficiency, and a production-level implementation for Intel MIC-based hardware platforms.
Similar resources
Evaluating Fast Maximum Likelihood-Based Phylogenetic Programs Using Empirical Phylogenomic Data Sets
The sizes of the data matrices assembled to resolve branches of the tree of life have increased dramatically, motivating the development of programs for fast, yet accurate, inference. For example, several different fast programs have been developed in the very popular maximum likelihood framework, including RAxML/ExaML, PhyML, IQ-TREE, and FastTree. Although these programs are widely used, a sy...
Phylogenomic analyses data of the avian phylogenomics project
BACKGROUND Determining the evolutionary relationships among the major lineages of extant birds has been one of the biggest challenges in systematic biology. To address this challenge, we assembled or collected the genomes of 48 avian species spanning most orders of birds, including all Neognathae and two of the five Palaeognathae orders. We used these genomes to construct a genome-scale avian p...
Time and memory efficient likelihood-based tree searches on phylogenomic alignments with missing data
MOTIVATION The current molecular data explosion poses new challenges for large-scale phylogenomic analyses that can comprise hundreds or even thousands of genes. A property that characterizes phylogenomic datasets is that they tend to be gappy, i.e. can contain taxa with (many and disparate) missing genes. In current phylogenomic analyses, this type of alignment gappyness that is induced by mis...
Efficient Computation of the Phylogenetic Likelihood Function on the Intel MIC Architecture
Phylogenetic inference is the process of reconstructing the evolutionary history of species based on their traits, nowadays mostly using molecular sequence data. Current state-of-the-art inference methods, like Bayesian and Maximum Likelihood (ML) inference, rely on the Phylogenetic Likelihood Function (PLF) as their computational core. Due to the large number of floating-point operations invol...
Development of a New Vernacular Tool for Diagnosis of Alcohol Dependence in the Emergency
Background: Alcohol dependence (AD) is a major reason for morbidity and visits to emergency medical settings. However, the detection of AD is often difficult or overlooked. This study aimed to develop a brief screening questionnaire in Hindi language for detection of AD in an emergency medical setting. Methods: The authors in consultation devised a set of questions related to AD in the Hindi l...